332 research outputs found
Understanding the Session Durability in Peer-to-Peer Storage System
This paper argues that short-term session durability, rather than long-term availability and reliability, has a major impact on the design of real large-scale peer-to-peer storage systems. We use a Markov chain to model session durability and derive its probability distribution. We then show how our analysis differs from the traditional Mean Time to Failure (MTTF) analysis, and conclude that misusing MTTF analysis gives a misleading picture of session durability. We further show the impact of session durability analysis on real system design. To the best of our knowledge, this is the first discussion of the effects of session durability in large-scale peer-to-peer storage systems.
Molecular Model of Dynamic Social Network Based on E-mail communication
In this work we consider an application of a physically inspired sociodynamical model to modelling the evolution of an email-based social network. Contrary to the standard approach of sociodynamics, which expresses system dynamics through heuristically defined simple rules, we propose inferring these rules from real data and applying them within a dynamic molecular model. We show how to embed the n-dimensional social space in a Euclidean one. Then, inspired by the Lennard-Jones potential, we define a data-driven social potential function and apply the resultant force to a real e-mail communication network in the course of a molecular simulation, with network nodes taking on the role of interacting particles. We discuss all steps of the modelling process, from data preparation, through embedding and the molecular simulation itself, to the transformation from the embedding space back to a graph structure. The conclusions, drawn from examining the resultant networks in stable, minimum-energy states, emphasize the role of the embedding process that projects the non-metric social graph into Euclidean space, the significance of the unavoidable loss of information connected with this procedure, and the resultant preservation of global rather than local properties of the initial network. We also argue for the applicability of our method to some classes of problems, while signalling the areas that require further research in order to expand this applicability domain.
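For reference, the standard Lennard-Jones pair potential that the abstract names as its inspiration, and the radial force derived from it, are written out here. The symbols \varepsilon (well depth), \sigma (distance at which the potential is zero) and r (distance between two particles) follow the usual physics convention and are not taken from the paper; the paper's own data-driven social potential is not given in the abstract and may differ in form.

```latex
V_{\mathrm{LJ}}(r) = 4\varepsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right],
\qquad
F(r) = -\frac{dV_{\mathrm{LJ}}}{dr}
     = \frac{24\varepsilon}{r} \left[ 2\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right]
```

The force is repulsive at short range and attractive at long range, with equilibrium at r = 2^{1/6}\sigma.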
Experimental evaluation of train and test split strategies in link prediction
In link prediction, the goal is to predict which links will appear in the future of an evolving network. To estimate the performance of these models in a supervised machine learning setting, disjoint and independent train and test sets are needed. However, objects in a real-world network are inherently related to each other, so it is far from trivial to separate candidate links into these disjoint sets. Here we characterize and empirically investigate the two dominant approaches from the literature for creating separate train and test sets in link prediction, referred to as random and temporal splits. Comparing the performance of these two approaches on several large temporal network datasets, we find evidence that random splits may yield overly optimistic results, whereas a temporal split may give a fairer and more realistic indication of performance. Results appear robust to the selection of temporal intervals. These findings will be of interest to researchers who employ link prediction or other machine learning tasks in networks.
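The difference between the two split strategies can be illustrated with a minimal Python sketch. It assumes edges are stored as (source, target, timestamp) triples; the function names `random_split` and `temporal_split`, the `test_fraction` parameter, and the toy data are hypothetical and not taken from the paper, which also involves steps omitted here (for example, constructing candidate non-links).

```python
import random

def random_split(edges, test_fraction=0.25, seed=0):
    """Shuffle timestamped edges and cut off a test set, ignoring time order."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def temporal_split(edges, test_fraction=0.25):
    """Train on the earliest edges and test on the most recent ones."""
    ordered = sorted(edges, key=lambda e: e[2])  # sort by timestamp
    cut = len(ordered) - int(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]

# Toy temporal network (illustrative data only): edge i -> i+1 appears at time i.
edges = [(i, i + 1, i) for i in range(8)]

rand_train, rand_test = random_split(edges)
temp_train, temp_test = temporal_split(edges)
```

With a random split, test edges may predate training edges, so the model can effectively see the future; the temporal split rules this out by construction.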
Navigability is a Robust Property
The Small World phenomenon has inspired researchers across a number of
fields. A breakthrough in its understanding was made by Kleinberg who
introduced Rank Based Augmentation (RBA): add to each vertex independently an
arc to a random destination selected from a carefully crafted probability
distribution. Kleinberg proved that RBA makes many networks navigable, i.e., it
allows greedy routing to successfully deliver messages between any two vertices
in a polylogarithmic number of steps. We prove that navigability is an inherent
property of many random networks, arising without coordination, or even
independence assumptions.
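Greedy routing over an augmented network can be sketched in a few lines of Python. This is a simplified stand-in rather than the construction analysed in the paper: it assumes a ring of n vertices, gives each vertex one extra arc whose destination is drawn with probability proportional to 1/distance, and forwards each message to whichever neighbour is closest to the target. All names (`ring_distance`, `add_long_range_arcs`, `greedy_route`) are hypothetical.

```python
import random

def ring_distance(u, v, n):
    """Shortest distance between u and v on a ring of n vertices."""
    d = abs(u - v)
    return min(d, n - d)

def add_long_range_arcs(n, seed=0):
    """Give each vertex one extra arc; destination chosen with weight 1/distance.

    Assumption: this distance-based weighting is an illustrative simplification.
    """
    rng = random.Random(seed)
    arcs = {}
    for u in range(n):
        others = [v for v in range(n) if v != u]
        weights = [1.0 / ring_distance(u, v, n) for v in others]
        arcs[u] = rng.choices(others, weights=weights, k=1)[0]
    return arcs

def greedy_route(source, target, n, arcs):
    """Repeatedly forward to the neighbour closest to the target."""
    path = [source]
    current = source
    while current != target:
        neighbours = [(current - 1) % n, (current + 1) % n, arcs[current]]
        # A ring neighbour always reduces the distance by one, so this terminates.
        current = min(neighbours, key=lambda v: ring_distance(v, target, n))
        path.append(current)
    return path

n = 64
arcs = add_long_range_arcs(n)
path = greedy_route(0, n // 2, n, arcs)
print("hops:", len(path) - 1)
```

Without the extra arcs the route from 0 to n // 2 would need n // 2 hops; the long-range arcs can only shorten it, since every step strictly reduces the distance to the target.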
The evolution of interdisciplinarity in physics research
Science, being a social enterprise, is subject to fragmentation into groups
that focus on specialized areas or topics. Often new advances occur through
cross-fertilization of ideas between sub-fields that otherwise have little
overlap as they study dissimilar phenomena using different techniques. Thus to
explore the nature and dynamics of scientific progress one needs to consider
the large-scale organization and interactions between different subject areas.
Here, we study the relationships between the sub-fields of Physics using the
Physics and Astronomy Classification Scheme (PACS) codes employed for
self-categorization of articles published over the past 25 years (1985-2009).
We observe a clear trend towards increasing interactions between the different
sub-fields. The network of sub-fields also exhibits core-periphery
organization, the nucleus being dominated by Condensed Matter and General
Physics. However, over time Interdisciplinary Physics is steadily increasing
its share in the network core, reflecting a shift in the overall trend of
Physics research.
Learning to Infer Social Ties in Large Networks
In online social networks, most relationships lack meaningful labels (e.g., “colleague” and “intimate friends”), simply because users do not take the time to label them. An interesting question is: can we automatically infer the type of social relationships in a large network? What are the fundamental factors that imply the type of social relationships? In this work, we formalize the problem of social relationship learning in a semi-supervised framework, and propose a Partially-labeled Pairwise Factor Graph Model (PLP-FGM) for learning to infer the type of social ties. We tested the model on three different genres of data sets: Publication, Email and Mobile. Experimental results demonstrate that the proposed PLP-FGM model can accurately infer 92.7% of advisor-advisee relationships from the coauthor network (Publication), 88.0% of manager-subordinate relationships from the email network (Email), and 83.1% of the friendships from the mobile network (Mobile). Finally, we develop a distributed learning algorithm to scale up the model to real large networks.
A Game Theoretic Model for the Formation of Navigable Small-World Networks
Kleinberg proposed a family of small-world networks to explain the navigability of large-scale real-world social networks. However, the underlying mechanism that drives real networks to be navigable is not yet well understood. In this paper, we present a game theoretic model for the formation of navigable small-world networks. We model network formation as a game in which people seek both high reciprocity and long-distance relationships. We show that the navigable small-world network is a Nash Equilibrium of the game. Moreover, we prove that the navigable small-world equilibrium tolerates collusions of any size and arbitrary deviations of a large random set of nodes, while non-navigable equilibria do not tolerate small group collusions or random perturbations. Our empirical evaluation further demonstrates that the system always converges to the navigable network even when limited or no information about other players' strategies is available. Our theoretical and empirical analyses provide important new insight into the connection between distance, reciprocity and navigability in social networks.
Risk-Averse Matchings over Uncertain Graph Databases
A large number of applications such as querying sensor networks, and
analyzing protein-protein interaction (PPI) networks, rely on mining uncertain
graph and hypergraph databases. In this work we study the following problem:
given an uncertain, weighted (hyper)graph, how can we efficiently find a
(hyper)matching with high expected reward, and low risk?
This problem naturally arises in the context of several important
applications, such as online dating, kidney exchanges, and team formation. We
introduce a novel formulation for finding matchings with maximum expected
reward and bounded risk under a general model of uncertain weighted
(hyper)graphs that we introduce in this work. Our model generalizes
probabilistic models used in prior work, and captures both continuous and
discrete probability distributions, thus allowing it to handle privacy-related
applications that inject appropriately distributed noise to (hyper)edge
weights. Given that our optimization problem is NP-hard, we turn our attention
to designing efficient approximation algorithms. For the case of uncertain
weighted graphs, we provide a -approximation algorithm, and a
-approximation algorithm with near optimal run time. For the case
of uncertain weighted hypergraphs, we provide a
-approximation algorithm, where is the rank of the
hypergraph (i.e., any hyperedge includes at most nodes), that runs in
almost (modulo log factors) linear time.
We complement our theoretical results by testing our approximation algorithms
on a wide variety of synthetic experiments, where we observe in a controlled
setting interesting findings on the trade-off between reward, and risk. We also
provide an application of our formulation for providing recommendations of
teams that are likely to collaborate, and have high impact.
Gender Detection on Social Networks using Ensemble Deep Learning
Analyzing the ever-increasing volume of posts on social media sites such as
Facebook and Twitter requires improved information processing methods for
profiling authorship. Document classification is central to this task, but the
performance of traditional supervised classifiers has degraded as the volume of
social media has increased. This paper addresses this problem in the context of
gender detection through ensemble classification that employs multi-model deep
learning architectures to generate specialized understanding from different
feature spaces.
From Relational Data to Graphs: Inferring Significant Links using Generalized Hypergeometric Ensembles
The inference of network topologies from relational data is an important
problem in data analysis. Exemplary applications include the reconstruction of
social ties from data on human interactions, the inference of gene
co-expression networks from DNA microarray data, or the learning of semantic
relationships based on co-occurrences of words in documents. Solving these
problems requires techniques to infer significant links in noisy relational
data. In this short paper, we propose a new statistical modeling framework to
address this challenge. It builds on generalized hypergeometric ensembles, a
class of generative stochastic models that give rise to analytically tractable
probability spaces of directed, multi-edge graphs. We show how this framework
can be used to assess the significance of links in noisy relational data. We
illustrate our method in two data sets capturing spatio-temporal proximity
relations between actors in a social system. The results show that our
analytical framework provides a new approach to infer significant links from
relational data, with interesting perspectives for the mining of data on social
systems.
Comment: 10 pages, 8 figures, accepted at SocInfo201